skip to main content


Search for: All records

Creators/Authors contains: "Nguyen, Hai Duc"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Serverless (or Function-as-a-Service) compute model enables new applications with dynamic scaling. However, all current Serverless systems are best-effort, and as we prove this means they cannot guarantee hard real-time deadlines, rendering them unsuitable for such real-time applications. We analyze a proposed extension of the Serverless model that adds a guaranteed invocation rate to the serverless model called Real-time Serverless. This approach aims to meet real-time deadlines with dynamically allocated function invocations. We first prove that the Serverless model does not support real-time guarantees. Next, we analyze Real-time Serverless, showing it can guarantee application real-time deadlines for rate-monotonic real-time workloads. Further, we derive bounds on the required invocation rate to meet any set of workload runtimes and periods. Subsequently, we explore an application technique, pre-invocation, and show that it can reduce the required guaranteed invocation rate. We derive bounds for the feasible rate guarantee reduction, and corresponding overhead in wasted compute resources. Finally, we apply the theoretical results to improve the experience quality of a distributed virtual reality/ augmented reality application as well as simplify the application design and resource management. 
    more » « less
    Free, publicly-accessible full text available February 23, 2025
  2. Stream Processing Engines (SPEs) traditionally de-ploy applications on a set of shared workers (e.g., threads, processes, or containers) requiring complex performance man-agement by SPEs and application developers. We explore a new approach that replaces workers with Rate-based Abstract Ma-chines (RBAMs). This allows SPEs to translate stream operations into FaaS invocations, and exploit guaranteed invocation rates to manage performance. This approach enables SPE applications to achieve transparent and predictable performance. We realize the approach in the Storm-RTS system. Exploring 36 stream processing scenarios over 5 different hardware config-urations, we demonstrate several key advantages. First, Storm-RTS provides stable application performance and can enable flexible reconfiguration across cloud resource configurations. Sec-ond, SPEs built on RBAM can be resource-efficient and scalable. Finally, Storm-RTS allows the stream-processing paradigm to be extended from the cloud to the edge, using its performance stability to hide edge heterogeneity and resource competition. An experiment with 4 cloud and edge sites over 300 cores shows how Storm-RTS can support flexible reconfiguration and simple high-level declarative policies that optimize resource cost or other criteria. 
    more » « less
    Free, publicly-accessible full text available July 1, 2024
  3. Foster, Ian ; Chard, Kyle ; Babuji, Yadu (Ed.)
    The historical motivation for serverless comes from internet-of-things, smartphone client server, and the objective of simplifying programming (no provisioning) and scale-down (pay-for-use). These applications are generally low-performance best-effort. However, the serverless model enables flexible software architectures suitable for a wide range of applications that demand high-performance and guaranteed performance. We have studied three such applications - scientific data streaming, virtual/augmented reality, and document annotation. We describe how each can be cast in a serverless software architecture and how the application performance requirements translate into high performance requirements (invocation rate, low and predictable latency) for the underlying serverless system implementation. These applications can require invocations rates as high as millions per second (40 MHz) and latency deadlines below a microsecond (300 ns), and furthermore require performance predictability. All of these capabilities are far in excess of today's commercial serverless offerings and represent interesting research challenges. 
    more » « less
  4. Today's serverless provides "function-as-a-service" with dynamic scaling and fine-grained resource charging, enabling new cloud applications. Serverless functions are invoked as a best-effort service. We propose an extension to serverless, called real-time serverless that provides an invocation rate guarantee, a service-level objective (SLO) specified by the application, and delivered by the underlying implementation. Real-time serverless allows applications to guarantee real-time performance. We study real-time serverless behavior analytically and empirically to characterize its ability to support bursty, real-time cloud and edge applications efficiently. Finally, we use a case study, traffic monitoring, to illustrate the use and benefits of real-time serverless, on our prototype implementation. 
    more » « less